Overview
Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 4600 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 575.1 KiB |
| Average record size in memory | 128.0 B |
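The memory figures above can be cross-checked with simple arithmetic. The sketch below assumes every one of the 16 variables is stored as an 8-byte value; the report does not state the storage types, so `BYTES_PER_VALUE` is an assumption.

```python
# Cross-check of the memory figures in the dataset statistics.
# Assumption (not stated in the report): every column holds 8-byte values.
N_OBSERVATIONS = 4600
N_VARIABLES = 16
BYTES_PER_VALUE = 8  # assumed, e.g. 64-bit numbers or object pointers

record_bytes = N_VARIABLES * BYTES_PER_VALUE
total_kib = N_OBSERVATIONS * record_bytes / 1024

print(f"record size: {record_bytes} B")
print(f"total size:  {total_kib:.1f} KiB")
```

Under this assumption the record size matches the reported 128.0 B, and the total lands within 0.1 KiB of the reported 575.1 KiB; the small remainder is presumably index overhead.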
Variable types
| DateTime | 1 |
|---|---|
| Numeric | 12 |
| Categorical | 3 |
Age is highly overall correlated with bathrooms and 2 other fields | High correlation |
Total_sqft is highly overall correlated with bathrooms and 5 other fields | High correlation |
bathrooms is highly overall correlated with Age and 6 other fields | High correlation |
bedrooms is highly overall correlated with Total_sqft and 3 other fields | High correlation |
floors is highly overall correlated with Age and 3 other fields | High correlation |
price is highly overall correlated with Total_sqft and 2 other fields | High correlation |
sqft_above is highly overall correlated with Total_sqft and 5 other fields | High correlation |
sqft_basement is highly overall correlated with Total_sqft | High correlation |
sqft_living is highly overall correlated with Total_sqft and 4 other fields | High correlation |
yr_built is highly overall correlated with Age and 2 other fields | High correlation |
waterfront is highly imbalanced (93.9%) | Imbalance |
view is highly imbalanced (71.9%) | Imbalance |
price is highly skewed (γ1 = 24.79093256) | Skewed |
price has 49 (1.1%) zeros | Zeros |
sqft_basement has 2745 (59.7%) zeros | Zeros |
yr_renovated has 2735 (59.5%) zeros | Zeros |
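One formula that reproduces the two imbalance figures above is one minus the normalized Shannon entropy of the class counts. Whether the profiling tool computes the score exactly this way is an assumption here; the function name `imbalance_score` is hypothetical, and the class counts are taken from the waterfront and view frequency tables later in this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum possible entropy) for class counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))  # entropy of a perfectly balanced variable
    return 1 - entropy / max_entropy

# Class counts as listed in the waterfront and view sections of the report.
waterfront_counts = [4567, 33]
view_counts = [4140, 205, 116, 70, 69]

waterfront_score = imbalance_score(waterfront_counts)
view_score = imbalance_score(view_counts)
print(f"waterfront: {waterfront_score:.1%}")
print(f"view:       {view_score:.1%}")
```

A score near 100% means almost all observations fall into a single class; a perfectly balanced variable scores 0%.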
Reproduction
| Analysis started | 2024-11-17 14:41:54.950587 |
|---|---|
| Analysis finished | 2024-11-17 14:42:24.355539 |
| Duration | 29.4 seconds |
| Software version | ydata-profiling v4.12.0 |
| Download configuration | config.json |
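The reported duration follows directly from the two timestamps. A minimal check, using only the values shown in the table above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-11-17 14:41:54.950587", FMT)
finished = datetime.strptime("2024-11-17 14:42:24.355539", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")
```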
Variables
date
Date
| Distinct | 70 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 36.1 KiB |
| Minimum | 2014-05-02 00:00:00 |
| Maximum | 2014-07-10 00:00:00 |
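The distinct count can be compared with the length of the date range. The sketch below uses only the minimum, maximum, and distinct count from the table above.

```python
from datetime import date

# Values copied from the date variable summary.
first = date(2014, 5, 2)
last = date(2014, 7, 10)
distinct_dates = 70

# Number of calendar days in the range, counting both endpoints.
calendar_days = (last - first).days + 1
every_day_present = calendar_days == distinct_dates
print(calendar_days, every_day_present)
```

Because the range spans exactly as many calendar days as there are distinct dates, every day between the minimum and maximum appears at least once.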
price
Real number (ℝ)
High correlation  Skewed  Zeros 
| Distinct | 1741 |
|---|---|
| Distinct (%) | 37.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 551962.99 |
| Minimum | 0 |
| Maximum | 26590000 |
| Zeros | 49 |
| Zeros (%) | 1.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 200000 |
| Q1 | 322875 |
| median | 460943.46 |
| Q3 | 654962.5 |
| 95-th percentile | 1184050 |
| Maximum | 26590000 |
| Range | 26590000 |
| Interquartile range (IQR) | 332087.5 |
Descriptive statistics
| Standard deviation | 563834.7 |
|---|---|
| Coefficient of variation (CV) | 1.0215082 |
| Kurtosis | 1044.3522 |
| Mean | 551962.99 |
| Median Absolute Deviation (MAD) | 157500 |
| Skewness | 24.790933 |
| Sum | 2.5390297 × 10⁹ |
| Variance | 3.1790957 × 10¹¹ |
| Monotonicity | Not monotonic |
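Several of the price statistics above are derived from others in the same tables. The sketch below recomputes them from the reported quantiles, mean, and standard deviation; small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the price quantile and descriptive statistics tables.
q1, q3 = 322875.0, 654962.5
minimum, maximum = 0.0, 26590000.0
mean, std = 551962.99, 563834.7

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation

print(f"IQR:      {iqr}")
print(f"Range:    {value_range}")
print(f"CV:       {cv:.7f}")
print(f"Variance: {variance:.7e}")
```

A coefficient of variation above 1 means the standard deviation exceeds the mean, which is consistent with the skew alert on this variable.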
| Value | Count | Frequency (%) |
| 0 | 49 | 1.1% |
| 300000 | 42 | 0.9% |
| 400000 | 31 | 0.7% |
| 600000 | 29 | 0.6% |
| 450000 | 29 | 0.6% |
| 440000 | 29 | 0.6% |
| 350000 | 28 | 0.6% |
| 250000 | 27 | 0.6% |
| 550000 | 27 | 0.6% |
| 415000 | 27 | 0.6% |
| Other values (1731) | 4282 | 93.1% |
| Value | Count | Frequency (%) |
| 0 | 49 | 1.1% |
| 7800 | 1 | < 0.1% |
| 80000 | 1 | < 0.1% |
| 83000 | 1 | < 0.1% |
| 83300 | 2 | < 0.1% |
| 84350 | 1 | < 0.1% |
| 87500 | 1 | < 0.1% |
| 90000 | 2 | < 0.1% |
| 100000 | 4 | 0.1% |
| 102500 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 26590000 | 1 | |
| 12899000 | 1 | |
| 7062500 | 1 | |
| 4668000 | 1 | |
| 4489000 | 1 | |
| 3800000 | 1 | |
| 3710000 | 1 | |
| 3200000 | 1 | |
| 3100000 | 1 | |
| 3000000 | 1 |
bedrooms
Real number (ℝ)
High correlation 
| Distinct | 10 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.4008696 |
| Minimum | 0 |
| Maximum | 9 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 3 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.90884812 |
|---|---|
| Coefficient of variation (CV) | 0.26723992 |
| Kurtosis | 1.2353774 |
| Mean | 3.4008696 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.45644663 |
| Sum | 15644 |
| Variance | 0.8260049 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3 | 2032 | 44.2% |
| 4 | 1531 | 33.3% |
| 2 | 566 | 12.3% |
| 5 | 353 | 7.7% |
| 6 | 61 | 1.3% |
| 1 | 38 | 0.8% |
| 7 | 14 | 0.3% |
| 8 | 2 | < 0.1% |
| 0 | 2 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 2 | < 0.1% |
| 1 | 38 | 0.8% |
| 2 | 566 | 12.3% |
| 3 | 2032 | |
| 4 | 1531 | |
| 5 | 353 | 7.7% |
| 6 | 61 | 1.3% |
| 7 | 14 | 0.3% |
| 8 | 2 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 9 | 1 | < 0.1% |
| 8 | 2 | < 0.1% |
| 7 | 14 | 0.3% |
| 6 | 61 | 1.3% |
| 5 | 353 | 7.7% |
| 4 | 1531 | |
| 3 | 2032 | |
| 2 | 566 | 12.3% |
| 1 | 38 | 0.8% |
| 0 | 2 | < 0.1% |
bathrooms
Real number (ℝ)
High correlation 
| Distinct | 26 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.1608152 |
| Minimum | 0 |
| Maximum | 8 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1.75 |
| median | 2.25 |
| Q3 | 2.5 |
| 95-th percentile | 3.5 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 0.75 |
Descriptive statistics
| Standard deviation | 0.78378107 |
|---|---|
| Coefficient of variation (CV) | 0.36272471 |
| Kurtosis | 1.8659047 |
| Mean | 2.1608152 |
| Median Absolute Deviation (MAD) | 0.5 |
| Skewness | 0.61603272 |
| Sum | 9939.75 |
| Variance | 0.61431277 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2.5 | 1189 | 25.8% |
| 1 | 743 | 16.2% |
| 1.75 | 629 | 13.7% |
| 2 | 427 | 9.3% |
| 2.25 | 419 | 9.1% |
| 1.5 | 291 | 6.3% |
| 2.75 | 276 | 6.0% |
| 3 | 167 | 3.6% |
| 3.5 | 162 | 3.5% |
| 3.25 | 136 | 3.0% |
| Other values (16) | 161 | 3.5% |
| Value | Count | Frequency (%) |
| 0 | 2 | < 0.1% |
| 0.75 | 17 | 0.4% |
| 1 | 743 | |
| 1.25 | 3 | 0.1% |
| 1.5 | 291 | 6.3% |
| 1.75 | 629 | |
| 2 | 427 | 9.3% |
| 2.25 | 419 | 9.1% |
| 2.5 | 1189 | |
| 2.75 | 276 | 6.0% |
| Value | Count | Frequency (%) |
| 8 | 1 | < 0.1% |
| 6.75 | 1 | < 0.1% |
| 6.5 | 1 | < 0.1% |
| 6.25 | 2 | < 0.1% |
| 5.75 | 1 | < 0.1% |
| 5.5 | 4 | 0.1% |
| 5.25 | 4 | 0.1% |
| 5 | 6 | 0.1% |
| 4.75 | 7 | 0.2% |
| 4.5 | 29 |
sqft_living
Real number (ℝ)
High correlation 
| Distinct | 566 |
|---|---|
| Distinct (%) | 12.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2139.347 |
| Minimum | 370 |
| Maximum | 13540 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 370 |
|---|---|
| 5-th percentile | 950 |
| Q1 | 1460 |
| median | 1980 |
| Q3 | 2620 |
| 95-th percentile | 3870 |
| Maximum | 13540 |
| Range | 13170 |
| Interquartile range (IQR) | 1160 |
Descriptive statistics
| Standard deviation | 963.20692 |
|---|---|
| Coefficient of variation (CV) | 0.45023408 |
| Kurtosis | 8.2916826 |
| Mean | 2139.347 |
| Median Absolute Deviation (MAD) | 570 |
| Skewness | 1.7235133 |
| Sum | 9840996 |
| Variance | 927767.56 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1940 | 32 | 0.7% |
| 1720 | 32 | 0.7% |
| 1840 | 31 | 0.7% |
| 1660 | 31 | 0.7% |
| 2000 | 30 | 0.7% |
| 1410 | 29 | 0.6% |
| 1200 | 28 | 0.6% |
| 1480 | 28 | 0.6% |
| 1890 | 27 | 0.6% |
| 1490 | 27 | 0.6% |
| Other values (556) | 4305 | 93.6% |
| Value | Count | Frequency (%) |
| 370 | 1 | |
| 380 | 1 | |
| 420 | 1 | |
| 430 | 1 | |
| 490 | 1 | |
| 520 | 1 | |
| 550 | 1 | |
| 560 | 1 | |
| 580 | 1 | |
| 590 | 2 |
| Value | Count | Frequency (%) |
| 13540 | 1 | |
| 10040 | 1 | |
| 9640 | 1 | |
| 8670 | 1 | |
| 8020 | 1 | |
| 7320 | 1 | |
| 7270 | 1 | |
| 7050 | 1 | |
| 6980 | 1 | |
| 6900 | 1 |
sqft_lot
Real number (ℝ)
| Distinct | 3113 |
|---|---|
| Distinct (%) | 67.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14852.516 |
| Minimum | 638 |
| Maximum | 1074218 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 638 |
|---|---|
| 5-th percentile | 1690.8 |
| Q1 | 5000.75 |
| median | 7683 |
| Q3 | 11001.25 |
| 95-th percentile | 43560 |
| Maximum | 1074218 |
| Range | 1073580 |
| Interquartile range (IQR) | 6000.5 |
Descriptive statistics
| Standard deviation | 35884.436 |
|---|---|
| Coefficient of variation (CV) | 2.416051 |
| Kurtosis | 219.87299 |
| Mean | 14852.516 |
| Median Absolute Deviation (MAD) | 2772 |
| Skewness | 11.307139 |
| Sum | 68321574 |
| Variance | 1.2876928 × 10⁹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5000 | 80 | 1.7% |
| 6000 | 65 | 1.4% |
| 4000 | 54 | 1.2% |
| 7200 | 50 | 1.1% |
| 4800 | 29 | 0.6% |
| 4500 | 25 | 0.5% |
| 9600 | 25 | 0.5% |
| 5500 | 23 | 0.5% |
| 3000 | 23 | 0.5% |
| 7500 | 23 | 0.5% |
| Other values (3103) | 4203 | 91.4% |
| Value | Count | Frequency (%) |
| 638 | 1 | |
| 681 | 1 | |
| 704 | 1 | |
| 746 | 1 | |
| 747 | 1 | |
| 750 | 1 | |
| 779 | 1 | |
| 833 | 1 | |
| 835 | 1 | |
| 844 | 2 |
| Value | Count | Frequency (%) |
| 1074218 | 1 | |
| 641203 | 1 | |
| 478288 | 1 | |
| 435600 | 2 | |
| 423838 | 1 | |
| 389126 | 1 | |
| 327135 | 1 | |
| 307752 | 1 | |
| 306848 | 1 | |
| 284011 | 1 |
floors
Real number (ℝ)
High correlation 
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.5120652 |
| Minimum | 1 |
| Maximum | 3.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1.5 |
| Q3 | 2 |
| 95-th percentile | 2 |
| Maximum | 3.5 |
| Range | 2.5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.53828838 |
|---|---|
| Coefficient of variation (CV) | 0.35599548 |
| Kurtosis | -0.53885198 |
| Mean | 1.5120652 |
| Median Absolute Deviation (MAD) | 0.5 |
| Skewness | 0.55144065 |
| Sum | 6955.5 |
| Variance | 0.28975438 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 2174 | 47.3% |
| 2 | 1811 | 39.4% |
| 1.5 | 444 | 9.7% |
| 3 | 128 | 2.8% |
| 2.5 | 41 | 0.9% |
| 3.5 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 2174 | |
| 1.5 | 444 | 9.7% |
| 2 | 1811 | |
| 2.5 | 41 | 0.9% |
| 3 | 128 | 2.8% |
| 3.5 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 3.5 | 2 | < 0.1% |
| 3 | 128 | 2.8% |
| 2.5 | 41 | 0.9% |
| 2 | 1811 | |
| 1.5 | 444 | 9.7% |
| 1 | 2174 |
waterfront
Categorical
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 36.1 KiB |
| 0 | 4567 |
|---|---|
| 1 | 33 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 4567 | 99.3% |
| 1 | 33 | 0.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 4567 | |
| 1 | 33 | 0.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 4600 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 4567 | |
| 1 | 33 | 0.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 4600 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 4567 | |
| 1 | 33 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4600 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 4567 | |
| 1 | 33 | 0.7% |
view
Categorical
Imbalance 
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 36.1 KiB |
| 0 | 4140 |
|---|---|
| 2 | 205 |
| 3 | 116 |
| 4 | 70 |
| 1 | 69 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 4 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 4140 | 90.0% |
| 2 | 205 | 4.5% |
| 3 | 116 | 2.5% |
| 4 | 70 | 1.5% |
| 1 | 69 | 1.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 4140 | |
| 2 | 205 | 4.5% |
| 3 | 116 | 2.5% |
| 4 | 70 | 1.5% |
| 1 | 69 | 1.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 4600 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 4140 | |
| 2 | 205 | 4.5% |
| 3 | 116 | 2.5% |
| 4 | 70 | 1.5% |
| 1 | 69 | 1.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 4600 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 4140 | |
| 2 | 205 | 4.5% |
| 3 | 116 | 2.5% |
| 4 | 70 | 1.5% |
| 1 | 69 | 1.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4600 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 4140 | |
| 2 | 205 | 4.5% |
| 3 | 116 | 2.5% |
| 4 | 70 | 1.5% |
| 1 | 69 | 1.5% |
condition
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 36.1 KiB |
| 3 | 2875 |
|---|---|
| 4 | 1252 |
| 5 | 435 |
| 2 | 32 |
| 1 | 6 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3 |
|---|---|
| 2nd row | 5 |
| 3rd row | 4 |
| 4th row | 4 |
| 5th row | 4 |
Common Values
| Value | Count | Frequency (%) |
| 3 | 2875 | 62.5% |
| 4 | 1252 | 27.2% |
| 5 | 435 | 9.5% |
| 2 | 32 | 0.7% |
| 1 | 6 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 2875 | |
| 4 | 1252 | |
| 5 | 435 | 9.5% |
| 2 | 32 | 0.7% |
| 1 | 6 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 4600 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 2875 | |
| 4 | 1252 | |
| 5 | 435 | 9.5% |
| 2 | 32 | 0.7% |
| 1 | 6 | 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 4600 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 3 | 2875 | |
| 4 | 1252 | |
| 5 | 435 | 9.5% |
| 2 | 32 | 0.7% |
| 1 | 6 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4600 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 3 | 2875 | |
| 4 | 1252 | |
| 5 | 435 | 9.5% |
| 2 | 32 | 0.7% |
| 1 | 6 | 0.1% |
sqft_above
Real number (ℝ)
High correlation 
| Distinct | 511 |
|---|---|
| Distinct (%) | 11.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1827.2654 |
| Minimum | 370 |
| Maximum | 9410 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 370 |
|---|---|
| 5-th percentile | 860 |
| Q1 | 1190 |
| median | 1590 |
| Q3 | 2300 |
| 95-th percentile | 3440 |
| Maximum | 9410 |
| Range | 9040 |
| Interquartile range (IQR) | 1110 |
Descriptive statistics
| Standard deviation | 862.16898 |
|---|---|
| Coefficient of variation (CV) | 0.47183565 |
| Kurtosis | 4.0701383 |
| Mean | 1827.2654 |
| Median Absolute Deviation (MAD) | 490 |
| Skewness | 1.4942107 |
| Sum | 8405421 |
| Variance | 743335.34 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1010 | 47 | 1.0% |
| 1200 | 47 | 1.0% |
| 1300 | 45 | 1.0% |
| 1140 | 44 | 1.0% |
| 1320 | 43 | 0.9% |
| 1150 | 42 | 0.9% |
| 1180 | 40 | 0.9% |
| 1090 | 40 | 0.9% |
| 1400 | 38 | 0.8% |
| 1050 | 37 | 0.8% |
| Other values (501) | 4177 | 90.8% |
| Value | Count | Frequency (%) |
| 370 | 1 | < 0.1% |
| 380 | 1 | < 0.1% |
| 420 | 1 | < 0.1% |
| 430 | 1 | < 0.1% |
| 490 | 1 | < 0.1% |
| 520 | 1 | < 0.1% |
| 550 | 3 | |
| 560 | 1 | < 0.1% |
| 580 | 1 | < 0.1% |
| 590 | 2 |
| Value | Count | Frequency (%) |
| 9410 | 1 | |
| 8020 | 1 | |
| 7680 | 1 | |
| 7320 | 1 | |
| 6640 | 1 | |
| 6430 | 1 | |
| 6420 | 1 | |
| 6120 | 1 | |
| 6070 | 1 | |
| 6050 | 1 |
sqft_basement
Real number (ℝ)
High correlation  Zeros 
| Distinct | 207 |
|---|---|
| Distinct (%) | 4.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 312.08152 |
| Minimum | 0 |
| Maximum | 4820 |
| Zeros | 2745 |
| Zeros (%) | 59.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 610 |
| 95-th percentile | 1210 |
| Maximum | 4820 |
| Range | 4820 |
| Interquartile range (IQR) | 610 |
Descriptive statistics
| Standard deviation | 464.13723 |
|---|---|
| Coefficient of variation (CV) | 1.4872307 |
| Kurtosis | 4.08238 |
| Mean | 312.08152 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.6427322 |
| Sum | 1435575 |
| Variance | 215423.37 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2745 | 59.7% |
| 500 | 53 | 1.2% |
| 600 | 45 | 1.0% |
| 800 | 43 | 0.9% |
| 900 | 41 | 0.9% |
| 700 | 38 | 0.8% |
| 1000 | 33 | 0.7% |
| 400 | 33 | 0.7% |
| 550 | 27 | 0.6% |
| 750 | 26 | 0.6% |
| Other values (197) | 1516 | 33.0% |
| Value | Count | Frequency (%) |
| 0 | 2745 | 59.7% |
| 20 | 1 | < 0.1% |
| 50 | 1 | < 0.1% |
| 60 | 2 | < 0.1% |
| 65 | 1 | < 0.1% |
| 70 | 1 | < 0.1% |
| 80 | 3 | 0.1% |
| 90 | 2 | < 0.1% |
| 100 | 14 | 0.3% |
| 110 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 4820 | 1 | |
| 4130 | 1 | |
| 2850 | 1 | |
| 2730 | 1 | |
| 2550 | 2 | |
| 2360 | 1 | |
| 2330 | 1 | |
| 2300 | 1 | |
| 2200 | 1 | |
| 2180 | 1 |
yr_built
Real number (ℝ)
High correlation 
| Distinct | 115 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1970.7863 |
| Minimum | 1900 |
| Maximum | 2014 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 1900 |
|---|---|
| 5-th percentile | 1913 |
| Q1 | 1951 |
| median | 1976 |
| Q3 | 1997 |
| 95-th percentile | 2009 |
| Maximum | 2014 |
| Range | 114 |
| Interquartile range (IQR) | 46 |
Descriptive statistics
| Standard deviation | 29.731848 |
|---|---|
| Coefficient of variation (CV) | 0.015086287 |
| Kurtosis | -0.6700759 |
| Mean | 1970.7863 |
| Median Absolute Deviation (MAD) | 23 |
| Skewness | -0.50215519 |
| Sum | 9065617 |
| Variance | 883.98281 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2006 | 111 | 2.4% |
| 2005 | 104 | 2.3% |
| 2007 | 93 | 2.0% |
| 2004 | 92 | 2.0% |
| 1978 | 90 | 2.0% |
| 2008 | 89 | 1.9% |
| 2003 | 89 | 1.9% |
| 1967 | 82 | 1.8% |
| 1977 | 80 | 1.7% |
| 2014 | 78 | 1.7% |
| Other values (105) | 3692 | 80.3% |
| Value | Count | Frequency (%) |
| 1900 | 22 | |
| 1901 | 9 | 0.2% |
| 1902 | 10 | 0.2% |
| 1903 | 10 | 0.2% |
| 1904 | 9 | 0.2% |
| 1905 | 19 | |
| 1906 | 27 | |
| 1907 | 12 | |
| 1908 | 19 | |
| 1909 | 22 |
| Value | Count | Frequency (%) |
| 2014 | 78 | |
| 2013 | 57 | |
| 2012 | 33 | 0.7% |
| 2011 | 24 | 0.5% |
| 2010 | 28 | 0.6% |
| 2009 | 50 | |
| 2008 | 89 | |
| 2007 | 93 | |
| 2006 | 111 | |
| 2005 | 104 |
yr_renovated
Real number (ℝ)
Zeros 
| Distinct | 60 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 808.60826 |
| Minimum | 0 |
| Maximum | 2014 |
| Zeros | 2735 |
| Zeros (%) | 59.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1999 |
| 95-th percentile | 2011 |
| Maximum | 2014 |
| Range | 2014 |
| Interquartile range (IQR) | 1999 |
Descriptive statistics
| Standard deviation | 979.41454 |
|---|---|
| Coefficient of variation (CV) | 1.2112349 |
| Kurtosis | -1.8511109 |
| Mean | 808.60826 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.3859187 |
| Sum | 3719598 |
| Variance | 959252.83 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2735 | 59.5% |
| 2000 | 170 | 3.7% |
| 2003 | 151 | 3.3% |
| 2009 | 109 | 2.4% |
| 2001 | 109 | 2.4% |
| 2005 | 95 | 2.1% |
| 2004 | 77 | 1.7% |
| 2014 | 72 | 1.6% |
| 2006 | 68 | 1.5% |
| 2013 | 61 | 1.3% |
| Other values (50) | 953 | 20.7% |
| Value | Count | Frequency (%) |
| 0 | 2735 | 59.5% |
| 1912 | 33 | 0.7% |
| 1913 | 1 | < 0.1% |
| 1923 | 57 | 1.2% |
| 1934 | 6 | 0.1% |
| 1945 | 7 | 0.2% |
| 1948 | 1 | < 0.1% |
| 1953 | 1 | < 0.1% |
| 1954 | 8 | 0.2% |
| 1955 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 2014 | 72 | |
| 2013 | 61 | |
| 2012 | 45 | |
| 2011 | 54 | |
| 2010 | 30 | 0.7% |
| 2009 | 109 | |
| 2008 | 45 | |
| 2007 | 7 | 0.2% |
| 2006 | 68 | |
| 2005 | 95 |
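The mean of 808.6 is dominated by the 2735 zero entries. If a zero stands for "never renovated" (an assumption; the report does not document the encoding), the mean over the remaining rows can be recovered from the reported sum and counts alone:

```python
# Values copied from the yr_renovated summary tables.
total_sum = 3719598
n_observations = 4600
n_zeros = 2735

# Assumption: 0 is a placeholder for "never renovated", not a real year.
n_renovated = n_observations - n_zeros
mean_all = total_sum / n_observations
mean_nonzero = total_sum / n_renovated  # zeros add nothing to the sum

print(f"rows with a renovation year: {n_renovated}")
print(f"mean over all rows:          {mean_all:.2f}")
print(f"mean over non-zero rows:     {mean_nonzero:.2f}")
```

Under that assumption the non-zero mean falls in the mid-1990s, which is a more interpretable summary than the overall mean.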
Age
Real number (ℝ)
High correlation 
| Distinct | 115 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 53.213696 |
| Minimum | 10 |
| Maximum | 124 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 15 |
| Q1 | 27 |
| median | 48 |
| Q3 | 73 |
| 95-th percentile | 111 |
| Maximum | 124 |
| Range | 114 |
| Interquartile range (IQR) | 46 |
Descriptive statistics
| Standard deviation | 29.731848 |
|---|---|
| Coefficient of variation (CV) | 0.55872549 |
| Kurtosis | -0.6700759 |
| Mean | 53.213696 |
| Median Absolute Deviation (MAD) | 23 |
| Skewness | 0.50215519 |
| Sum | 244783 |
| Variance | 883.98281 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 18 | 111 | 2.4% |
| 19 | 104 | 2.3% |
| 17 | 93 | 2.0% |
| 20 | 92 | 2.0% |
| 46 | 90 | 2.0% |
| 16 | 89 | 1.9% |
| 21 | 89 | 1.9% |
| 57 | 82 | 1.8% |
| 47 | 80 | 1.7% |
| 10 | 78 | 1.7% |
| Other values (105) | 3692 |
| Value | Count | Frequency (%) |
| 10 | 78 | |
| 11 | 57 | |
| 12 | 33 | 0.7% |
| 13 | 24 | 0.5% |
| 14 | 28 | 0.6% |
| 15 | 50 | |
| 16 | 89 | |
| 17 | 93 | |
| 18 | 111 | |
| 19 | 104 |
| Value | Count | Frequency (%) |
| 124 | 22 | |
| 123 | 9 | 0.2% |
| 122 | 10 | 0.2% |
| 121 | 10 | 0.2% |
| 120 | 9 | 0.2% |
| 119 | 19 | |
| 118 | 27 | |
| 117 | 12 | |
| 116 | 19 | |
| 115 | 22 |
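The Age statistics mirror those of yr_built, which suggests Age was derived as a reference year minus yr_built. The sketch below tests that relationship with `REFERENCE_YEAR = 2024`; this value is inferred from the numbers in this report and is not stated anywhere in it. The (yr_built, Age) pairs are taken from the sample rows at the end of the report.

```python
REFERENCE_YEAR = 2024  # inferred assumption, not stated in the report
N_OBSERVATIONS = 4600

# (yr_built, Age) pairs copied from the sample rows of the report.
sample_pairs = [(1955, 69), (1921, 103), (1966, 58), (2013, 11), (1990, 34)]
pairs_match = all(REFERENCE_YEAR - built == age for built, age in sample_pairs)

# Summary statistics of yr_built copied from its section.
yr_built = {"min": 1900, "max": 2014, "mean": 1970.7863, "sum": 9065617}

# A constant minus a variable swaps min and max and shifts mean and sum.
derived_age = {
    "min": REFERENCE_YEAR - yr_built["max"],
    "max": REFERENCE_YEAR - yr_built["min"],
    "mean": REFERENCE_YEAR - yr_built["mean"],
    "sum": REFERENCE_YEAR * N_OBSERVATIONS - yr_built["sum"],
}
print(pairs_match, derived_age)
```

Such a transformation leaves the standard deviation, variance, and kurtosis unchanged and flips the sign of the skewness, which matches the two sections and explains the correlation of -1.000 between Age and yr_built.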
Total_sqft
Real number (ℝ)
High correlation 
| Distinct | 659 |
|---|---|
| Distinct (%) | 14.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2451.4285 |
| Minimum | 370 |
| Maximum | 17670 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 36.1 KiB |
Quantile statistics
| Minimum | 370 |
|---|---|
| 5-th percentile | 950 |
| Q1 | 1547.5 |
| median | 2260 |
| Q3 | 3042.5 |
| 95-th percentile | 4610.5 |
| Maximum | 17670 |
| Range | 17300 |
| Interquartile range (IQR) | 1495 |
Descriptive statistics
| Standard deviation | 1242.1942 |
|---|---|
| Coefficient of variation (CV) | 0.50672261 |
| Kurtosis | 10.006532 |
| Mean | 2451.4285 |
| Median Absolute Deviation (MAD) | 750 |
| Skewness | 1.8799238 |
| Sum | 11276571 |
| Variance | 1543046.5 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1150 | 28 | 0.6% |
| 1480 | 28 | 0.6% |
| 1010 | 26 | 0.6% |
| 1940 | 26 | 0.6% |
| 2120 | 24 | 0.5% |
| 2550 | 24 | 0.5% |
| 1700 | 23 | 0.5% |
| 2670 | 23 | 0.5% |
| 1200 | 23 | 0.5% |
| 1590 | 23 | 0.5% |
| Other values (649) | 4352 | 94.6% |
| Value | Count | Frequency (%) |
| 370 | 1 | |
| 380 | 1 | |
| 420 | 1 | |
| 430 | 1 | |
| 490 | 1 | |
| 520 | 1 | |
| 550 | 1 | |
| 560 | 1 | |
| 580 | 1 | |
| 590 | 2 |
| Value | Count | Frequency (%) |
| 17670 | 1 | |
| 14460 | 1 | |
| 12400 | 1 | |
| 11220 | 1 | |
| 9780 | 1 | |
| 9040 | 1 | |
| 8980 | 1 | |
| 8630 | 1 | |
| 8550 | 1 | |
| 8330 | 1 |
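The reported sums and the sample rows are consistent with two relationships: sqft_living = sqft_above + sqft_basement, and Total_sqft = sqft_living + sqft_basement. Both are inferred from the numbers in this report rather than documented by it. A minimal check:

```python
# Column sums copied from the descriptive statistics of each variable.
sum_living = 9840996
sum_above = 8405421
sum_basement = 1435575
sum_total = 11276571

living_is_above_plus_basement = sum_above + sum_basement == sum_living
total_is_living_plus_basement = sum_living + sum_basement == sum_total

# (sqft_living, sqft_above, sqft_basement, Total_sqft) from the sample rows.
rows = [
    (1340, 1340, 0, 1340),
    (3650, 3370, 280, 3930),
    (2000, 1000, 1000, 3000),
    (1940, 1140, 800, 2740),
    (2090, 1070, 1020, 3110),
]
rows_consistent = all(
    above + basement == living and living + basement == total
    for living, above, basement, total in rows
)
print(living_is_above_plus_basement, total_is_living_plus_basement, rows_consistent)
```

If both relationships hold, basement area is counted twice in Total_sqft, which is worth keeping in mind when reading its correlations with the other area variables.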
Interactions
Correlations
| | Age | Total_sqft | bathrooms | bedrooms | condition | floors | price | sqft_above | sqft_basement | sqft_living | sqft_lot | view | waterfront | yr_built | yr_renovated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | -0.191 | -0.530 | -0.160 | 0.265 | -0.538 | -0.084 | -0.460 | 0.212 | -0.322 | 0.012 | 0.053 | 0.027 | -1.000 | 0.315 |
| Total_sqft | -0.191 | 1.000 | 0.675 | 0.631 | 0.030 | 0.217 | 0.602 | 0.628 | 0.593 | 0.943 | 0.288 | 0.210 | 0.235 | 0.191 | -0.085 |
| bathrooms | -0.530 | 0.675 | 1.000 | 0.538 | 0.129 | 0.540 | 0.492 | 0.696 | 0.190 | 0.747 | 0.092 | 0.146 | 0.166 | 0.530 | -0.213 |
| bedrooms | -0.160 | 0.631 | 0.538 | 1.000 | 0.066 | 0.220 | 0.338 | 0.533 | 0.248 | 0.652 | 0.238 | 0.086 | 0.000 | 0.160 | -0.056 |
| condition | 0.265 | 0.030 | 0.129 | 0.066 | 1.000 | 0.186 | 0.000 | 0.108 | 0.117 | 0.046 | 0.052 | 0.027 | 0.000 | 0.265 | 0.217 |
| floors | -0.538 | 0.217 | 0.540 | 0.220 | 0.186 | 1.000 | 0.321 | 0.604 | -0.288 | 0.397 | -0.204 | 0.033 | 0.000 | 0.538 | -0.229 |
| price | -0.084 | 0.602 | 0.492 | 0.338 | 0.000 | 0.321 | 1.000 | 0.534 | 0.237 | 0.631 | 0.075 | 0.094 | 0.226 | 0.084 | -0.071 |
| sqft_above | -0.460 | 0.628 | 0.696 | 0.533 | 0.108 | 0.604 | 0.534 | 1.000 | -0.172 | 0.843 | 0.305 | 0.102 | 0.134 | 0.460 | -0.169 |
| sqft_basement | 0.212 | 0.593 | 0.190 | 0.248 | 0.117 | -0.288 | 0.237 | -0.172 | 1.000 | 0.323 | 0.023 | 0.195 | 0.211 | -0.212 | 0.054 |
| sqft_living | -0.322 | 0.943 | 0.747 | 0.652 | 0.046 | 0.397 | 0.631 | 0.843 | 0.323 | 1.000 | 0.325 | 0.173 | 0.269 | 0.322 | -0.127 |
| sqft_lot | 0.012 | 0.288 | 0.092 | 0.238 | 0.052 | -0.204 | 0.075 | 0.305 | 0.023 | 0.325 | 1.000 | 0.049 | 0.000 | -0.012 | 0.051 |
| view | 0.053 | 0.210 | 0.146 | 0.086 | 0.027 | 0.033 | 0.094 | 0.102 | 0.195 | 0.173 | 0.049 | 1.000 | 0.483 | 0.055 | 0.050 |
| waterfront | 0.027 | 0.235 | 0.166 | 0.000 | 0.000 | 0.000 | 0.226 | 0.134 | 0.211 | 0.269 | 0.000 | 0.483 | 1.000 | 0.026 | 0.000 |
| yr_built | -1.000 | 0.191 | 0.530 | 0.160 | 0.265 | 0.538 | 0.084 | 0.460 | -0.212 | 0.322 | -0.012 | 0.055 | 0.026 | 1.000 | -0.315 |
| yr_renovated | 0.315 | -0.085 | -0.213 | -0.056 | 0.217 | -0.229 | -0.071 | -0.169 | 0.054 | -0.127 | 0.051 | 0.050 | 0.000 | -0.315 | 1.000 |
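The high-correlation alerts in the overview can be recovered from this matrix by filtering each row on absolute value. The sketch below uses a cutoff of 0.5, which is an assumption chosen because it reproduces the alerts for price and Age; the helper name `highly_correlated` is hypothetical.

```python
THRESHOLD = 0.5  # assumed cutoff, consistent with the alerts in this report

# Rows copied from the correlation table (diagonal entries omitted).
price_row = {
    "Age": -0.084, "Total_sqft": 0.602, "bathrooms": 0.492, "bedrooms": 0.338,
    "condition": 0.000, "floors": 0.321, "sqft_above": 0.534,
    "sqft_basement": 0.237, "sqft_living": 0.631, "sqft_lot": 0.075,
    "view": 0.094, "waterfront": 0.226, "yr_built": 0.084, "yr_renovated": -0.071,
}
age_row = {
    "Total_sqft": -0.191, "bathrooms": -0.530, "bedrooms": -0.160,
    "condition": 0.265, "floors": -0.538, "price": -0.084, "sqft_above": -0.460,
    "sqft_basement": 0.212, "sqft_living": -0.322, "sqft_lot": 0.012,
    "view": 0.053, "waterfront": 0.027, "yr_built": -1.000, "yr_renovated": 0.315,
}

def highly_correlated(row, threshold=THRESHOLD):
    """Return the field names whose absolute correlation meets the threshold."""
    return sorted(name for name, r in row.items() if abs(r) >= threshold)

price_fields = highly_correlated(price_row)
age_fields = highly_correlated(age_row)
print("price:", price_fields)
print("Age:  ", age_fields)
```

Each list has three entries, matching the "and 2 other fields" wording of the corresponding alerts. Note that bathrooms (0.492) falls just below the cutoff for price.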
Missing values
Sample
| | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | sqft_above | sqft_basement | yr_built | yr_renovated | Age | Total_sqft |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2014-05-02 00:00:00 | 313000.0 | 3.0 | 1.50 | 1340 | 7912 | 1.5 | 0 | 0 | 3 | 1340 | 0 | 1955 | 2005 | 69 | 1340 |
| 1 | 2014-05-02 00:00:00 | 2384000.0 | 5.0 | 2.50 | 3650 | 9050 | 2.0 | 0 | 4 | 5 | 3370 | 280 | 1921 | 0 | 103 | 3930 |
| 2 | 2014-05-02 00:00:00 | 342000.0 | 3.0 | 2.00 | 1930 | 11947 | 1.0 | 0 | 0 | 4 | 1930 | 0 | 1966 | 0 | 58 | 1930 |
| 3 | 2014-05-02 00:00:00 | 420000.0 | 3.0 | 2.25 | 2000 | 8030 | 1.0 | 0 | 0 | 4 | 1000 | 1000 | 1963 | 0 | 61 | 3000 |
| 4 | 2014-05-02 00:00:00 | 550000.0 | 4.0 | 2.50 | 1940 | 10500 | 1.0 | 0 | 0 | 4 | 1140 | 800 | 1976 | 1992 | 48 | 2740 |
| 5 | 2014-05-02 00:00:00 | 490000.0 | 2.0 | 1.00 | 880 | 6380 | 1.0 | 0 | 0 | 3 | 880 | 0 | 1938 | 1994 | 86 | 880 |
| 6 | 2014-05-02 00:00:00 | 335000.0 | 2.0 | 2.00 | 1350 | 2560 | 1.0 | 0 | 0 | 3 | 1350 | 0 | 1976 | 0 | 48 | 1350 |
| 7 | 2014-05-02 00:00:00 | 482000.0 | 4.0 | 2.50 | 2710 | 35868 | 2.0 | 0 | 0 | 3 | 2710 | 0 | 1989 | 0 | 35 | 2710 |
| 8 | 2014-05-02 00:00:00 | 452500.0 | 3.0 | 2.50 | 2430 | 88426 | 1.0 | 0 | 0 | 4 | 1570 | 860 | 1985 | 0 | 39 | 3290 |
| 9 | 2014-05-02 00:00:00 | 640000.0 | 4.0 | 2.00 | 1520 | 6200 | 1.5 | 0 | 0 | 3 | 1520 | 0 | 1945 | 2010 | 79 | 1520 |
| | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | sqft_above | sqft_basement | yr_built | yr_renovated | Age | Total_sqft |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4590 | 2014-07-08 00:00:00 | 380680.555556 | 4.0 | 2.50 | 2620 | 8331 | 2.0 | 0 | 0 | 3 | 2620 | 0 | 1991 | 0 | 33 | 2620 |
| 4591 | 2014-07-08 00:00:00 | 396166.666667 | 3.0 | 1.75 | 1880 | 5752 | 1.0 | 0 | 0 | 4 | 940 | 940 | 1945 | 0 | 79 | 2820 |
| 4592 | 2014-07-08 00:00:00 | 252980.000000 | 4.0 | 2.50 | 2530 | 8169 | 2.0 | 0 | 0 | 3 | 2530 | 0 | 1993 | 0 | 31 | 2530 |
| 4593 | 2014-07-08 00:00:00 | 289373.307692 | 3.0 | 2.50 | 2538 | 4600 | 2.0 | 0 | 0 | 3 | 2538 | 0 | 2013 | 1923 | 11 | 2538 |
| 4594 | 2014-07-09 00:00:00 | 210614.285714 | 3.0 | 2.50 | 1610 | 7223 | 2.0 | 0 | 0 | 3 | 1610 | 0 | 1994 | 0 | 30 | 1610 |
| 4595 | 2014-07-09 00:00:00 | 308166.666667 | 3.0 | 1.75 | 1510 | 6360 | 1.0 | 0 | 0 | 4 | 1510 | 0 | 1954 | 1979 | 70 | 1510 |
| 4596 | 2014-07-09 00:00:00 | 534333.333333 | 3.0 | 2.50 | 1460 | 7573 | 2.0 | 0 | 0 | 3 | 1460 | 0 | 1983 | 2009 | 41 | 1460 |
| 4597 | 2014-07-09 00:00:00 | 416904.166667 | 3.0 | 2.50 | 3010 | 7014 | 2.0 | 0 | 0 | 3 | 3010 | 0 | 2009 | 0 | 15 | 3010 |
| 4598 | 2014-07-10 00:00:00 | 203400.000000 | 4.0 | 2.00 | 2090 | 6630 | 1.0 | 0 | 0 | 3 | 1070 | 1020 | 1974 | 0 | 50 | 3110 |
| 4599 | 2014-07-10 00:00:00 | 220600.000000 | 3.0 | 2.50 | 1490 | 8102 | 2.0 | 0 | 0 | 4 | 1490 | 0 | 1990 | 0 | 34 | 1490 |